Consensus Training for Consensus Decoding in Machine Translation
Abstract
We propose a novel objective function for discriminatively tuning log-linear machine translation models. Our objective explicitly optimizes the BLEU score of expected n-gram counts, the same quantities that arise in forest-based consensus and minimum Bayes risk decoding methods. Our continuous objective can be optimized using simple gradient ascent. However, computing critical quantities in the gradient necessitates a novel dynamic program, which we also present here. Assuming BLEU as an evaluation measure, our objective function has two principal advantages over standard max BLEU tuning. First, it specifically optimizes model weights for downstream consensus decoding procedures. An unexpected second benefit is that it reduces overfitting, which can improve test set BLEU scores when using standard Viterbi decoding.
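To make "the BLEU score of expected n-gram counts" concrete, here is a minimal sketch for a single sentence. It is not the paper's implementation: it assumes a small n-best list in place of a translation forest, so the expectations are plain sums and the dynamic program mentioned in the abstract is not needed. It also assumes that expected counts are clipped against reference counts in the same way as integer counts in standard BLEU. All function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def posteriors(features, weights):
    """Log-linear posterior over hypotheses: P(e) proportional to exp(w . phi(e))."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def expected_ngram_counts(hyps, probs, n):
    """Expected count of each n-gram under the hypothesis distribution."""
    expected = Counter()
    for hyp, p in zip(hyps, probs):
        for gram, count in ngrams(hyp, n).items():
            expected[gram] += p * count
    return expected


def bleu_of_expected_counts(hyps, features, weights, reference, max_n=4):
    """BLEU computed from expected n-gram counts instead of one hypothesis."""
    probs = posteriors(features, weights)
    expected_len = sum(p * len(h) for h, p in zip(hyps, probs))
    log_precision = 0.0
    for n in range(1, max_n + 1):
        expected = expected_ngram_counts(hyps, probs, n)
        ref = ngrams(reference, n)
        # Assumption: clip each expected count by the reference count.
        matched = sum(min(c, ref[g]) for g, c in expected.items())
        total = sum(expected.values())
        if matched <= 0 or total <= 0:
            return 0.0
        log_precision += math.log(matched / total) / max_n
    brevity = min(1.0, math.exp(1 - len(reference) / expected_len))
    return brevity * math.exp(log_precision)
```

Because the posteriors vary smoothly with the weights, this score is a continuous function of the weights, which is what makes gradient ascent applicable. The gradient itself is not shown here.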
Similar resources
Fast Consensus Hypothesis Regeneration for Machine Translation
This paper presents a fast consensus hypothesis regeneration approach for machine translation. It combines the advantages of feature-based fast consensus decoding and hypothesis regeneration. Our approach is more efficient than previous work on hypothesis regeneration, and it explores a wider search space than consensus decoding, resulting in improved performance. Experimental results show cons...
Model Combination for Machine Translation
Machine translation benefits from two types of decoding techniques: consensus decoding over multiple hypotheses under a single model and system combination over hypotheses from different models. We present model combination, a method that integrates consensus decoding and system combination into a unified, forest-based technique. Our approach makes few assumptions about the underlying component...
The RWTH Aachen System for NTCIR-10 PatentMT
This paper describes the statistical machine translation (SMT) systems developed by RWTH Aachen University for the Patent Translation task of the 10th NTCIR Workshop. Both phrase-based and hierarchical SMT systems were trained for the Japanese-English and Chinese-English tasks. Experiments were conducted to compare standard and inverse direction decoding, the performance of several additional m...
Collaborative Decoding: Partial Hypothesis Re-ranking Using Translation Consensus between Decoders
This paper presents collaborative decoding (co-decoding), a new method to improve machine translation accuracy by leveraging translation consensus between multiple machine translation decoders. Different from system combination and MBR decoding, which postprocess the n-best lists or word lattice of machine translation decoders, in our method multiple machine translation decoders collaborate by ...
Learning Translation Consensus with Structured Label Propagation
In this paper, we address the issue for learning better translation consensus in machine translation (MT) research, and explore the search of translation consensus from similar, rather than the same, source sentences or their spans. Unlike previous work on this topic, we formulate the problem as structured labeling over a much smaller graph, and we propose a novel structured label propagation f...
Publication date: 2009